
Creators/Authors contains: "Zeng, Donglin"


  1. Abstract

    We propose a test-based elastic integrative analysis of the randomised trial and real-world data to estimate treatment effect heterogeneity with a vector of known effect modifiers. When the real-world data are not subject to bias, our approach combines the trial and real-world data for efficient estimation. Utilising the trial design, we construct a test to decide whether or not to use real-world data. We characterise the asymptotic distribution of the test-based estimator under local alternatives. We provide a data-adaptive procedure to select the test threshold that promises the smallest mean square error and an elastic confidence interval with a good finite-sample coverage property.
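The test-then-combine idea in this abstract can be illustrated with a deliberately simplified scalar sketch. Everything here is an assumption made for illustration: the `Estimate` container, the function name `elastic_estimate`, the independence of the two estimates, the inverse-variance pooling, and the fixed threshold of 3.84 (roughly the 0.95 quantile of a chi-square distribution with one degree of freedom). The abstract describes a vector of effect modifiers and a data-adaptive threshold, neither of which is reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float      # point estimate of the treatment effect
    variance: float   # estimated sampling variance


def elastic_estimate(trial, rwd, threshold=3.84):
    """Pool trial and real-world estimates only if a discrepancy test passes.

    Hypothetical scalar illustration: assumes the two estimates are
    independent and that the trial estimate is unbiased.
    Returns (estimate, test_statistic, used_real_world_data).
    """
    diff = rwd.value - trial.value
    # Wald-type statistic for "both sources estimate the same effect".
    test_stat = diff ** 2 / (trial.variance + rwd.variance)
    if test_stat >= threshold:
        # Evidence of bias in the real-world data: fall back to the trial.
        return trial, test_stat, False
    # No evidence of bias: inverse-variance weighted combination.
    w_trial = 1.0 / trial.variance
    w_rwd = 1.0 / rwd.variance
    pooled_value = (w_trial * trial.value + w_rwd * rwd.value) / (w_trial + w_rwd)
    pooled_variance = 1.0 / (w_trial + w_rwd)
    return Estimate(pooled_value, pooled_variance), test_stat, True


if __name__ == "__main__":
    trial = Estimate(value=1.0, variance=0.04)
    for rwd in (Estimate(1.1, 0.01), Estimate(2.0, 0.01)):
        est, stat, pooled = elastic_estimate(trial, rwd)
        print(f"stat={stat:.2f} pooled={pooled} estimate={est.value:.3f}")
```

Because the final estimator depends on the outcome of the test, its sampling distribution is not that of either branch alone, which is why the abstract studies its asymptotic distribution and proposes a dedicated confidence interval.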

     
  2. Abstract

    Complementary features of randomized controlled trials (RCTs) and observational studies (OSs) can be used jointly to estimate the average treatment effect of a target population. We propose a calibration weighting estimator that enforces covariate balance between the RCT and OS, thereby improving the generalizability of the trial-based estimator. Exploiting semiparametric efficiency theory, we propose a doubly robust augmented calibration weighting estimator that achieves the efficiency bound derived under the identification assumptions. A nonparametric sieve method is provided as an alternative to the parametric approach; it enables robust approximation of the nuisance functions and data-adaptive selection of outcome predictors for calibration. We establish asymptotic results and confirm the finite-sample performance of the proposed estimators through simulation experiments and an application to estimating the treatment effect of adjuvant chemotherapy for early-stage non-small-cell lung cancer patients after surgery.
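Calibration weighting of this kind can be sketched for a single covariate. The exponential-tilting form of the weights, the bisection solver, and the name `calibration_weights` are assumptions chosen for illustration and are not taken from the abstract. The augmented doubly robust estimator and the sieve extension are not shown.

```python
import math


def calibration_weights(x_trial, target_mean, lo=-50.0, hi=50.0, tol=1e-10):
    """Weights on trial units so the weighted covariate mean hits a target.

    Hypothetical sketch: weights are proportional to exp(lam * x), and lam
    is found by bisection. target_mean stands in for the covariate mean in
    the observational sample. If the target is extremely close to the
    smallest or largest trial value, the search interval [lo, hi] may be
    too narrow.
    """
    if not min(x_trial) < target_mean < max(x_trial):
        raise ValueError("target mean must lie strictly inside the trial range")

    def weights(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * x for x in x_trial)
        raw = [math.exp(lam * x - shift) for x in x_trial]
        total = sum(raw)
        return [r / total for r in raw]

    def gap(lam):
        # Weighted trial mean minus target; increasing in lam.
        return sum(w * x for w, x in zip(weights(lam), x_trial)) - target_mean

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    return weights((lo + hi) / 2.0)


if __name__ == "__main__":
    x_trial = [0.0, 1.0, 2.0, 3.0, 4.0]
    w = calibration_weights(x_trial, target_mean=2.8)
    print([round(v, 4) for v in w])
```

With several covariates or basis functions, the same balance condition becomes a system of equations in a vector of tilting parameters, which is typically solved with a Newton-type method rather than bisection.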

     
  3. Pharmacoinformatics research has had considerable success in detecting drug‐induced adverse events (AEs) using large‐scale health record databases. In the era of polypharmacy, pharmacoinformatics faces many new challenges, two of the most significant being the detection of high‐order drug interactions and the handling of strongly correlated drugs. In this article, we propose a super‐combo‐drug test (SupCD‐T) to address these two challenges. SupCD‐T detects drug interactions by identifying optimal drug combinations with increased AE risks. In addition, SupCD‐T increases the statistical power to detect single‐drug effects by combining strongly correlated drugs. Although SupCD‐T does not distinguish single‐drug effects from their combination effects, it is noticeably more powerful at selecting an individual drug effect than multiple regression analysis, in which adjusting for confounding between two correlated drugs reduces the power to test individual drug effects on AEs. Our simulation studies demonstrate that SupCD‐T generally has better power than multiple regression analysis. In addition, SupCD‐T is able to select meaningful drug combinations (eg, highly coprescribed drugs). Using an electronic health record database, we illustrate the utility of SupCD‐T and discover a number of drug combinations that carry an increased risk of myopathy. Some of these novel drug combinations have not yet been investigated or reported in pharmacology research.

     
  4. Abstract

    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at the county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
  5. Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub ( https://covid19forecasthub.org/ ) collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multimodel ensemble forecast that combined predictions from dozens of groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naïve baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-wk horizon three to five times larger than when predicting at a 1-wk horizon. This project underscores the role that collaboration and active coordination between governmental public-health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks. 
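Combining probabilistic forecasts from several models and scoring the result can be sketched as follows. The abstract does not specify the combination rule or the error metric, so the quantile-wise median and the pinball (quantile) loss used here are assumptions, as are the model names and all numbers, which are invented purely for illustration.

```python
from statistics import median


def median_ensemble(forecasts):
    """Combine models by taking the median prediction at each quantile level.

    forecasts maps a model name to {quantile_level: predicted_value}.
    Assumes every model reports the same quantile levels.
    """
    levels = sorted(next(iter(forecasts.values())))
    return {lvl: median(f[lvl] for f in forecasts.values()) for lvl in levels}


def pinball_loss(level, predicted, observed):
    """Quantile loss for one predicted quantile; lower is better."""
    diff = observed - predicted
    return max(level * diff, (level - 1.0) * diff)


def mean_pinball(forecast, observed):
    """Average pinball loss over all quantile levels of one forecast."""
    losses = [pinball_loss(lvl, q, observed) for lvl, q in forecast.items()]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical one-week-ahead forecasts of incident deaths.
    forecasts = {
        "model_a": {0.1: 80.0, 0.5: 100.0, 0.9: 130.0},
        "model_b": {0.1: 90.0, 0.5: 120.0, 0.9: 160.0},
        "model_c": {0.1: 60.0, 0.5: 95.0, 0.9: 110.0},
    }
    observed = 110.0
    ensemble = median_ensemble(forecasts)
    for name, fc in {**forecasts, "ensemble": ensemble}.items():
        print(name, round(mean_pinball(fc, observed), 3))
```

Averaging such a score over many dates, locations, and horizons is one way to compare individual models with an ensemble and with a naive baseline.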
  6. Drug‐drug interactions (DDIs) are a common cause of adverse drug events (ADEs). The electronic medical record (EMR) database and the FDA's adverse event reporting system (FAERS) database are the major data sources for mining and testing ADE‐associated DDI signals. Most DDI data mining methods focus on pair‐wise drug interactions, and methods to detect high‐dimensional DDIs in medical databases are lacking. In this paper, we propose 2 novel mixture drug‐count response models for detecting high‐dimensional drug combinations that induce myopathy. The “count” indicates the number of drugs in a combination. One model is called the fixed probability mixture drug‐count response model with a maximum risk threshold (FMDRM‐MRT). The other is called the count‐dependent probability mixture drug‐count response model with a maximum risk threshold (CMDRM‐MRT), in which the mixture probability is count dependent. Compared with the previous mixture drug‐count response model (MDRM) developed by our group, these 2 new models show a better likelihood in detecting high‐dimensional drug combinatory effects on myopathy. CMDRM‐MRT identified and validated 54, 374, 637, 442, and 131 drug interactions of orders 2‐way to 6‐way, respectively, which induce myopathy in both the EMR and FAERS databases. We further demonstrate that FAERS data capture a much higher maximum myopathy risk than EMR data do. The consistency of the 2 mixture models' parameter estimates and local false discovery rate estimates is evaluated through statistical simulation studies.

     